(54) Content-based digital-image classification method

(57) The classification method involves the following
steps: defining a set of low-level features describing
the semantic content of the image, said features being
quantities obtainable from the image by means of
logico-mathematical expressions that are known beforehand,
and the choice of said features depending upon
the image classes used for the classification; indexing
an image to be classified, with the purpose of extracting
therefrom a feature vector, the components of which
consist of the values assumed, in the image, by said
low-level features; splitting the feature space defined by the
low-level features into a plurality of classification regions,
to each one of said regions there being associated
a respective image class, and each classification region
being the locus of the points of the feature space
defined by a finite set of conditions laid on at least one
component of the feature vector; associating the feature
vector to the feature space; identifying, among the
classification regions, a specific classification region
containing the feature vector extracted from the image to
be classified; and identifying the image class associated
to the specific classification region identified.



[FIG. 1: flowchart of the classification method. Block 10: definition of set of features; block 20: image indexing; block 30: splitting of feature space; block 40: identification of classification region; block 50: identification of image class.]



EP1 102 180 A1 



Description 

[0001] The present invention regards a content- 
based digital-image classification method. 
[0002] In particular, the present invention finds an ad- 
vantageous, but not exclusive, application in the classi- 
fication of images according to the following three class- 
es: photographs, texts, and graphics. Consequently, the 
ensuing treatment will refer to these classes, without this 
implying any loss of generality. 

[0003] The present invention moreover finds advan- 
tageous application in the classification of photographs 
according to the following three classes: outdoor, indoor, 
and close-ups; in the classification of outdoor photo- 
graphs, according to the following four classes: land- 
scapes, buildings, actions, and people; in the classifica- 
tion of indoor photographs, according to the following 
two classes: presence and absence of people; in the 
classification of close-ups, according to the following 
two classes: portraits and objects; in the classification 
of graphics, according to the following three classes: clip 
art, graphic illustrations (photorealistic graphics), and 
business graphics (tables and charts); and in the classification
of texts, according to the following two classes:
black and white, and colour. 

[0004] As is known, the Internet and the Web have become
the key enablers which have motivated and rendered
possible the revolution in the management of all
the steps necessary for the use of images in digital format,
i.e., the so-called "imaging workflow". This emerging
workflow structure depends upon the effective implementation
of three fundamental steps: image acquisition,
the so-called "digital way-in"; image re-utilisation,
the so-called "digital recirculation"; and cross-device image
rendering, the so-called "digital way-out", i.e., the
rendering of the images among heterogeneous devices
(monitor, printer, etc.), in particular, the processing of the
images for a specific purpose, such as printing or filing.
[0005] A content-based digital-image classification 
has by now become an indispensable need for an ac- 
curate description and use of digital images, particularly 
for the adoption of the most suitable image-processing 
strategies for satisfying the ever-increasing demand for 
quality of image, speed of transmission, and ease of use 
in Internet-based applications, such as improvement of 
digital images, i.e., the so-called "image enhancement", 
colour-processing, and image compression. 
[0006] At present, one of the methodologies used for 
content-based digital-image classification is essentially 
based on an approach of a heuristic type, implemented 
by means of expert systems. In other words, this meth- 
odology basically involves determination of the content 
of the image by analysing the digital image in regions of 
variable size according to directions and pre-set scanning
rules using an algorithm of the type "if... then... else",
i.e., by evaluating the meaning of the region of
interest in the light of the characteristics of the preceding
or adjacent regions, as well as by the verification of a
structured sequence of membership conditions with one
or more rules.

[0007] Although widely used, the above methodology
presents a number of drawbacks. The first drawback is
represented by the computational complexity required
for analysis of the high number of pixels of an image,
along with the other evident drawbacks in terms of time
and cost associated thereto. The second drawback is
represented by the extremely complex optimisation that
this methodology may be subject to. The third drawback
is represented by the substantial impossibility of optimising
analysis using parallel architectures. The fourth
drawback is due to the not extremely high intrinsic
"robustness" of the methodology, caused by the unavoidable
possibility of not considering, in the above-mentioned
"if... then... else" algorithm, particular cases that
may arise in images.

[0008] The aim of the present invention is to provide
a content-based digital-image classification method free
from the drawbacks of the known methods.

[0009] According to the present invention, a
content-based digital-image classification method is provided,
as defined in claim 1.

[0010] For a better understanding of the present invention,
a preferred embodiment thereof is now described,
simply to provide a non-limiting example, with
reference to the attached drawings, in which:

Figure 1 shows a flowchart relative to the digital-image
classification method according to the
present invention; and

Figure 2 shows a flowchart relative to the construction
of a binary tree-structured classifier used in the
present classification method.

[0011] First of all it should be emphasized that in what
follows the term "images" indicates not only the complete
images, but also the subimages obtained by dividing
an image up (image splitting).
[0012] Figure 1 shows a flowchart relative to the
digital-image classification method according to the
present invention.

[0013] According to what is illustrated in Figure 1, the
present classification method involves the following
steps:

defining a set of N low-level features, which, taken
together, describe the semantic content of an image,
and which consist of quantities that can be obtained
from the image by means of logico-mathematical
expressions that are known beforehand,
their choice depending upon the image classes
used for the classification (block 10);
indexing the image to be classified, with the purpose
of extracting therefrom a feature vector
X = [X_1, X_2, ..., X_N] formed by the values assumed, in
said image, by the N low-level features (block 20);
and
processing, in the way described in greater detail
hereinafter, the feature vector X according to a
processing algorithm so as to identify the class of
the image (blocks 30-50).

[0014] In particular, the choice of the low-level features
is an essential factor for a good classification of
the image on the basis of its pictorial content. The following
criteria of choice have guided the systematic
study carried out by the applicant with the purpose of
determining the features of the image that are best suited
for describing the content of the image in terms of
colour, contrast, and form (see also the following publications:
1) G. Ciocca and R. Schettini, "A relevance feedback
mechanism for content-based image retrieval", Information
Processing and Management 35, pp.
605-632, 1999; and 2) I. Gagliardi and R. Schettini, "A
method for the automatic indexing of color images for
effective image retrieval", The New Review of Hypermedia
and Multimedia 3, pp. 201-224, 1997):

discrimination power (the feature has a small variance
within each class, and the distances between
its mean values in different classes are high); and
efficiency (the feature may be rapidly processed).
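
By way of non-limiting illustration, the discrimination-power criterion may be sketched as a ratio between the spread of the class means and the mean within-class variance of a single feature. The function name `discrimination_score`, the particular ratio, and the sample values are assumptions made for this example; the description does not prescribe a specific formula.

```python
from statistics import mean, pvariance


def discrimination_score(values_by_class):
    """Hypothetical score for one feature: variance of the class means
    divided by the mean within-class variance (higher is better)."""
    class_means = [mean(values) for values in values_by_class.values()]
    within = mean(pvariance(values) for values in values_by_class.values())
    between = pvariance(class_means)
    return between / within if within > 0 else float("inf")


# Illustrative (made-up) values of one feature on a few images per class.
samples = {
    "photographs": [0.10, 0.12, 0.11],
    "graphics": [0.50, 0.52, 0.48],
    "texts": [0.90, 0.88, 0.92],
}
score = discrimination_score(samples)

# A feature with identical distributions in every class scores zero.
weak = discrimination_score({"a": [0.0, 1.0], "b": [0.0, 1.0]})
```

A feature whose class means are far apart relative to its within-class variance obtains a high score; the efficiency criterion is not modelled here.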

[0015] The study carried out by the applicant using the
criteria of choice referred to above has led to the identification
of the low-level features listed hereinafter,
which, according to an aspect of the present invention,
constitute a sort of library of features, from among which
are chosen, according to the classes of image amongst
which it is aimed to carry out the classification, the N
low-level features used for indexing the image:

a) the colour histogram in the 64-colour quantized
hue saturation value (HSV) colour space;

b) the colour coherence vectors (CCVs) in the
64-colour quantized HSV colour space; the pixels of
each colour bucket are defined as coherent or incoherent
according to whether or not they belong to similarly
coloured regions (i.e., regions of one and the same
colour) having a size greater than a threshold value;
for further details see, for example, G. Pass, R. Zabih,
and J. Miller, "Comparing Images Using Color
Coherence Vectors", ACM Multimedia 96, pp.
65-73, 1996;

c) the 11-colour quantized colour-transition histogram
in the HSV colour space (in particular, red, orange,
yellow, green, blue, purple, pink, brown,
black, grey, and white); for further details, see, for
example, I. Gagliardi and R. Schettini, "A method
for the automatic indexing of color images for effective
image retrieval", The New Review of Hypermedia
and Multimedia 3, pp. 201-224, 1997;

d) the moments of inertia of colour distribution in the
non-quantized HSV colour space; for further details,
see, for example, M.A. Stricker and M. Orengo,
"Similarity of Color Images", paper presented
at the SPIE Storage and Retrieval for Image and
Video Databases III Conference, 1995;

e) the moments of inertia (mean value, variance, 
and skewness) and the kurtosis of the luminance of 
the image; 

f) the percentage of non-coloured pixels in the im- 
age; 

g) the number of colours of the image in the 64-col- 
our quantized HSV colour space; 

h) the statistical information on the edges of the im- 
age extracted by means of Canny's algorithm; in 
particular: 

h1) the percentage of low, medium and high
contrast edge pixels in the image;
h2) the parametric thresholds on the gradient
strength corresponding to medium and
high-contrast edges;
h3) the number of connected regions identified
by closed high-contrast contours; and
h4) the percentage of medium-contrast edge
pixels connected to high-contrast edges;

i) the histogram of the directions of the edges extracted
by means of Canny's edge detector (15
bins, each having an angular width of 12°,
have been used to represent the histogram); for further
details, see, for example, G. Ciocca and R.
Schettini, "A relevance feedback mechanism for
content-based image retrieval", Information
Processing and Management 35, pp. 605-632,
1999;

j) the mean value and the variance of the absolute
values of the coefficients of the subimages of the
first three levels of the multi-resolution Daubechies
wavelet transform of the luminance of the image;
for further details, see, for example, P. Scheunders,
S. Livens, G. Van de Wouwer, P. Vautrot, and D. Van
Dyck, "Wavelet-based Texture Analysis", International
Journal of Computer Science and Information
Management, 1997;

k) the estimation of the texture characteristics of the
image based on the neighbourhood grey-tone difference
matrix (NGTDM), in particular coarseness,
contrast, busyness, complexity, and strength (a
quantity used in image-texture analysis); for further
details, see, for example, the following: 1) M. Amadasun
and R. King, "Textural features corresponding
to textural properties", IEEE Transactions on
Systems, Man and Cybernetics 19, pp. 1264-1274,
1989; and 2) H. Tamura, S. Mori, and T. Yamawaki,
"Textural features corresponding to visual perception",
IEEE Transactions on Systems, Man and Cybernetics
8, pp. 460-473, 1978;
l) the spatial-chromatic histogram of the colour regions
identified by means of the 11-colour quantization
process in the HSV colour space (for further
details, see, for example, the above-mentioned
publication "A relevance feedback mechanism for
content-based image retrieval"), and in particular:

l1) the co-ordinates of the centroid of the colours;
and

l2) the dispersion of the colour regions (i.e., pixel
regions of the same colour) with respect to
their centroids;

m) the spatial composition of the colour regions
identified by means of the 11-colour quantization
process (for further details, see the above-mentioned
publication "A relevance feedback mechanism
for content-based image retrieval"), and in particular:

m1) the fragmentation (the number of colour regions);

m2) the distribution of the colour regions with
respect to the centre of the image; and
m3) the distribution of the colour regions with
respect to the x-axis and with respect to the
y-axis.
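
By way of non-limiting illustration, features a) and g) of the above list may be sketched as follows. The uniform 4 x 4 x 4 quantization of hue, saturation and value is an assumption made for this example, since the description does not state how the 64 colours are obtained; the function names are likewise hypothetical.

```python
import colorsys


def hsv_bucket(r, g, b, levels=4):
    """Map an 8-bit RGB pixel to one of levels**3 HSV buckets.
    Assumption: uniform quantization of H, S and V (64 buckets for levels=4)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    q = [min(int(c * levels), levels - 1) for c in (h, s, v)]
    return (q[0] * levels + q[1]) * levels + q[2]


def colour_histogram(pixels, levels=4):
    """Feature a): normalised histogram over the quantized HSV buckets."""
    hist = [0.0] * levels ** 3
    for r, g, b in pixels:
        hist[hsv_bucket(r, g, b, levels)] += 1.0
    n = len(pixels)
    return [count / n for count in hist] if n else hist


def number_of_colours(hist):
    """Feature g): number of quantized colours actually present."""
    return sum(1 for value in hist if value > 0)


# Tiny made-up "image": three pure-red pixels and one pure-blue pixel.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
hist = colour_histogram(pixels)
colours = number_of_colours(hist)
```

The 64 histogram values and the colour count would each contribute components to the feature vector X.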

[0016] As may be noted, the total number of features
is relatively high (389, although this number is not
necessarily fixed), given that a number of direction and
colour histograms of intrinsically large size are used.
However, the extremely different nature of the features
enables reduction of the risk of classifying in the same
class images that are very different from one another.
[0017] As mentioned previously, following upon in- 
dexing of the image to be classified, the feature vector 
X is processed according to a processing algorithm with 
the purpose of identifying the class of the image. 
[0018] In particular, processing of the feature vector 
X involves the following steps: 

splitting the feature space (vector space), defined 
by the N features selected, into a finite number of 
classification regions, to each of which is associat- 
ed a respective image class, and each of which is 
the locus of the points of the feature space defined 
by a finite set of conditions laid on one or more com- 
ponents of the feature vector X, or in other words, 
the locus of the points of the feature space in which 
the values assumed by one or more components of 
the feature vector X satisfy predetermined relations 
with respective threshold values (block 30); 
associating the feature vector X extracted from the
image to be classified to the feature space, and then
identifying, amongst the various classification regions
into which the feature space is split, a specific
classification region containing the feature vector X
(block 40); and
identifying the image class associated to the specific
classification region identified (block 50), the
class of the image undergoing classification thus being
the one associated to the specific classification
region identified.
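
By way of non-limiting illustration, the steps of blocks 30-50 may be sketched with each classification region represented as a finite set of threshold conditions on components of the feature vector. The two-component feature space, the threshold values, and the names `REGIONS` and `classify` are hypothetical and chosen only for this example.

```python
import operator

# A condition is (component index, comparison, threshold value).
# A region is an image class plus the finite set of conditions defining it.
# Assumption: a two-component feature space with made-up thresholds.
REGIONS = [
    ("texts", [(0, operator.le, 0.2)]),
    ("graphics", [(0, operator.gt, 0.2), (1, operator.le, 0.5)]),
    ("photographs", [(0, operator.gt, 0.2), (1, operator.gt, 0.5)]),
]


def classify(x, regions=REGIONS):
    """Return the class of the region whose conditions x satisfies (blocks 40-50)."""
    for image_class, conditions in regions:
        if all(compare(x[i], threshold) for i, compare, threshold in conditions):
            return image_class
    return None  # not reached when the regions cover the whole feature space


result = classify([0.6, 0.8])
```

In this sketch the three regions are disjoint and cover the whole space, so every feature vector falls into exactly one of them.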

[0019] The classification methodology described with
reference to blocks 30-50 is in practice implemented by
using a binary tree-structured classifier, which is conveniently
constructed according to the known CART
methodology; for a detailed treatment of this methodology,
the reader is referred to the following texts:

1) L. Breiman, J.H. Friedman, R.A. Olshen, and C. 
J. Stone, "Classification and Regression Trees", 
Wadsworth and Brooks/Cole, Pacific Grove, Cali- 
fornia, 1984; and 

2) B.D. Ripley, "Pattern Recognition and Neural 
Networks", Cambridge University Press, Cam- 
bridge, 1996. 

[0020] The CART methodology has been chosen in that
it enables management of any combination of features 
selected from among the aforementioned list and the co- 
existence of different relations between the features in 
different classification regions of the image-characteris- 
tics space. 

[0021] In addition, the CART methodology provides a
clear characterisation of the conditions that control the 
classification, i.e., the conditions that determine when 
an image belongs to one given class of images rather 
than to another. 

[0022] The procedure for construction of the tree- 
structured classifier basically involves carrying out a re- 
cursive binary partition of the feature space according 
to a predetermined binary partition criterion, from which
the above-mentioned conditions that control splitting of 
the feature space into classification regions are drawn. 
[0023] In particular, with reference to Figure 2, the 
construction of the binary tree-structured classifier
involves the following steps:

defining a set of training images which comprises, 
for each image class, a plurality of images having 
different characteristics (block 70); 
indexing each of the training images to extract, from 
each one of them, a respective feature vector, the 
components of which are represented by the values 
assumed, for said training image, by the above- 
mentioned N low-level features (block 80); and 
constructing the aforesaid binary tree-structured 
classifier by performing a recursive binary partition
procedure based on the feature vectors extracted 
from the training images and on a pre-set partition 
criterion (block 90). 



[0024] In the terminology proper to the processes of
construction of the trees, the feature space defined by
the N low-level features is the root node, whilst the various
classification regions are the terminal nodes, or
leaves, of the tree, each of which is labelled with a
corresponding image class.

[0025] Classification of an image is in practice per- 
formed by supplying at input to the binary tree-struc- 
tured classifier the feature vector X of an image, then 
traversing the classifier until a terminal node is reached 
and associating to the image undergoing classification 
the image class made up of the label attached to the 
terminal node in which the feature vector X has finished. 
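
By way of non-limiting illustration, traversal of a binary tree-structured classifier may be sketched as follows. The `Node` layout, the convention that the left branch is taken when the tested component does not exceed the threshold, and the example tree are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    label: Optional[str] = None      # image class, set on terminal nodes
    feature: int = 0                 # index of the tested component of X
    threshold: float = 0.0           # threshold value of the condition
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken otherwise


def classify(node: Node, x: Sequence[float]) -> Optional[str]:
    """Descend from the root until a terminal node is reached; return its label."""
    while node.left is not None and node.right is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical tree with made-up thresholds on a two-component vector.
tree = Node(
    feature=0,
    threshold=0.2,
    left=Node(label="texts"),
    right=Node(
        feature=1,
        threshold=0.5,
        left=Node(label="graphics"),
        right=Node(label="photographs"),
    ),
)
result = classify(tree, [0.6, 0.3])
```

Classification thus requires at most one comparison per tree level, whatever the size of the image.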
[0026] The procedure for construction of a binary
tree-structured classifier is substantially defined by three
rules:

the node-splitting criterion adopted; 

the construction-procedure termination criterion; 

and 

the terminal-node labelling criterion. 

[0027] In particular: 

the splitting criterion is such as to render the two 
descendant nodes derived from a parent node as 
internally homogeneous as possible in terms of 
types of images contained therein; 
the construction-procedure termination criterion is 
defined by the achievement of a minimum size of 
the nodes (i.e., achieving of a minimum number of 
images within each node); and 
the terminal-node labelling criterion is such as to 
minimise the image misclassification likelihood, i.e., 
to minimise the expected costs deriving from an image
misclassification.

[0028] In detail, the terminal-node labelling criterion
involves assigning to each node of the tree-structured
classifier, whether this be a terminal node or an intermediate
node, the following properties:

a label L, i.e., the name of the class of images, which
is chosen among a set of J predefined labels such
as to minimise the expected image misclassification
cost relative to the node, which is described in
greater detail in what follows;
a cardinality of the node, i.e., the total number of
training images which, during the construction of
the classifier, have reached the node; in mathematical
form, the cardinality of the node may be expressed
as follows:

$$\mathrm{Size} = \sum_{j=1}^{J} N_j$$

a probability distribution of the labels in the node,
i.e., for each image class, the ratio between the
number of training images of that class which, traversing
the classifier, have reached the node, and the cardinality
of the node; in mathematical form, the probability
distribution of the labels in the node may be expressed
as

$$P_j = \frac{N_j}{\mathrm{Size}}, \qquad j = 1, \ldots, J$$

an image misclassification cost relative to the node,
which is indicative of the reliability of the classifier
conditioned at the node; for example, the image
misclassification cost relative to the node, designated
as MC, may be defined by the following formula,
which represents the ratio between the number of
misclassified images in the node and the cardinality
of the node:

$$MC = \frac{\sum_{j \neq L} N_j}{\mathrm{Size}}$$

in which N_j is the number of training images
belonging to the j-th image class which, during
construction of the classifier, have reached the node.
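
By way of non-limiting illustration, the node properties listed above (label, cardinality, probability distribution of the labels, and misclassification cost MC) may be computed as sketched below. The sketch assumes a non-empty node and equal misclassification costs for all classes, in which case the cost-minimising label is simply the most frequent class; the function name `node_properties` and the counts are hypothetical.

```python
from collections import Counter


def node_properties(labels, classes):
    """labels: classes of the training images that reached the node (non-empty).
    Assumption: equal misclassification costs, so the cost-minimising
    label is the majority class."""
    counts = Counter(labels)                       # N_j for each class j
    size = len(labels)                             # cardinality of the node
    probabilities = {c: counts[c] / size for c in classes}
    label = max(classes, key=lambda c: counts[c])  # label L of the node
    mc = sum(counts[c] for c in classes if c != label) / size
    return label, size, probabilities, mc


classes = ["photographs", "graphics", "texts"]
reached = ["photographs"] * 8 + ["graphics"] + ["texts"]
label, size, probabilities, mc = node_properties(reached, classes)
```

With these made-up counts, 2 of the 10 images in the node belong to a class other than the node label, which gives MC = 0.2.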



[0029] It is pointed out that the expected image mis- 
classification cost relative to the node coincides with the 
expected probability of misclassification of the images 
that have reached the node in the case where the image 
misclassification costs corresponding to the individual 
image classes coincide with one another.
[0030] It is moreover emphasized that numerous oth- 
er formulas may be used to define an expected image 
misclassification cost relative to a node, these formulas
being in any case required to provide an indication re- 
garding the classifier reliability conditioned at the node. 
[0031] In addition, the properties described previously 
can be assigned indifferently to all the nodes of the
tree-structured classifier, or else only to the terminal nodes.
[0032] Finally, it is pointed out that the probability dis- 
tribution of the labels in a node, the cardinality of the 
node, and the image misclassification cost relative to the 
node could be determined using a set of training images 
different from the one used for constructing the tree- 
structured classifier. 

[0033] In addition, according to a further aspect of the
present invention, once construction of the tree-structured
classifier is completed and once the properties listed
above have been assigned to its nodes, a procedure
for validation of the labelling of the terminal nodes of the
classifier is carried out.

[0034] In particular, the labelling validation procedure 
involves considering, one at a time, the terminal nodes 
of the classifier and, for each one of these, comparing 
its cardinality with a threshold cardinality, and its mis- 
classification cost with a threshold misclassification 
cost. 

[0035] Should the cardinality associated to the node 
considered be less than the threshold cardinality, or the 
misclassification cost associated to the node be greater 
than the threshold misclassification cost, then a reject- 
ed-images class is associated to the node considered, 
and consequently the label assigned to it during con- 
struction of the classifier is eliminated, and a "rejected 
images" label is assigned to that node; otherwise, the 
label assigned to the node during construction of the 
classifier is confirmed. 
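
By way of non-limiting illustration, the labelling validation of a single terminal node may be sketched as follows. The function name `validate_label` and the threshold values used in the example are hypothetical.

```python
REJECTED = "rejected images"


def validate_label(label, cardinality, mc, min_cardinality, max_mc):
    """Confirm the label of a terminal node, or replace it with the
    "rejected images" label when the node is too small or too unreliable."""
    if cardinality < min_cardinality or mc > max_mc:
        return REJECTED
    return label


# Made-up thresholds: at least 5 training images and MC of at most 0.25.
confirmed = validate_label("texts", cardinality=40, mc=0.05,
                           min_cardinality=5, max_mc=0.25)
too_small = validate_label("texts", cardinality=3, mc=0.0,
                           min_cardinality=5, max_mc=0.25)
unreliable = validate_label("graphics", cardinality=40, mc=0.40,
                            min_cardinality=5, max_mc=0.25)
```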

[0036] In general, in fact, in a problem of classification 
understood as assignment of objects to defined classes, 
it is convenient to introduce a further class of images 
called "rejected-images class", to which may be as- 
signed the images that the classifier used classifies with 
a level of "reliability" which is not considered acceptable 
for the problem in question. 

[0037] In a problem of image classification in which
the image classes are those of photographs, graphics
and texts, it is quite foreseeable, for instance, that in the
rejected-images class there may finish up photographs
of graphics, a few illustrations, and/or composite images,
i.e., images deriving from the combination of images,
each one of which belongs to one of the image classes
envisaged.

[0038] The images that have reached a terminal node
to which the "rejected images" label is assigned may, if
necessary, subsequently be classified anyway, so that,
in the application considered, the consequence of a possible
wrong assignment causes as little harm as possible,
or else an ad hoc strategy could be applied to them,
such as, for example in the case of composite images,
segmentation.

[0039] In addition, the tree-structured classification 
methodology enables definition of the set of the condi- 
tions on the values of the low-level features which an 
image must satisfy for it to be assigned to the rejected- 
images class, this in practice being defined by the join- 
ing of those terminal nodes to which the "rejected imag- 
es" label is assigned. 

[0040] It is moreover emphasized that, in the terminal
node labelling validation procedure, the decision on
whether or not to validate the label of the terminal node
may also be taken on the basis of just one of the two
properties described; i.e., it may be taken by considering
just the cardinality of the node, or else just the
misclassification cost associated thereto.
[0041] As regards the criterion of splitting of the
nodes, one of the key problems is how to define the
goodness of the split. The most widely used approach
is to select the split that causes the data contained in
the descendant nodes to be more "homogeneous" than
the data contained in the parent node. A function that
defines a measure of the goodness of the split is the
"impurity of the nodes" function, which in practice measures
the "disorder" of the image classes within the node,
and the smaller the impurity of a node, the greater the
goodness of the split.

[0042] In other words, to carry out splitting of a node,
first of all a plurality of possible splits are generated by
imposing a finite set of conditions on each component
of the feature vector, and, among the various splitting
possibilities, the one that maximises the difference between
the impurity of the parent node and the impurity
of the descendant nodes is chosen.
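
By way of non-limiting illustration, the selection of a split may be sketched as follows. The Gini index is used here as the impurity function and the impurity of the descendant nodes is taken as their size-weighted average; both choices are assumptions made for this example, since the description does not fix a particular impurity function. The names `gini` and `best_split` and the training values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node (assumed impurity function)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(vectors, labels, min_size=1):
    """Return (impurity decrease, feature index, threshold) of the best split,
    or None when no admissible split exists."""
    parent = gini(labels)
    n = len(labels)
    best = None
    for f in range(len(vectors[0])):
        values = sorted({x[f] for x in vectors})
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for x, y in zip(vectors, labels) if x[f] <= t]
            right = [y for x, y in zip(vectors, labels) if x[f] > t]
            if len(left) < min_size or len(right) < min_size:
                continue
            children = (len(left) * gini(left) + len(right) * gini(right)) / n
            decrease = parent - children
            if best is None or decrease > best[0]:
                best = (decrease, f, t)
    return best


# Made-up training vectors: component 0 separates the two classes perfectly.
vectors = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
labels = ["texts", "texts", "photographs", "photographs"]
decrease, feature, threshold = best_split(vectors, labels)
```

Applying `best_split` recursively to the two descendant nodes, until a minimum node size is reached, would yield the recursive binary partition described above.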

[0043] Another function that may be used to measure 
the goodness of a splitting of a node is the reduction in 
deviance; for a more detailed treatment, the reader is 
referred to the following texts: 

1) L.A. Clark and D. Pregibon, "Tree-based models",
in Statistical Models in S, J.M. Chambers and
T.J. Hastie (eds.), pp. 377-419, Chapman and Hall,
London, 1992; and
2) P. McCullagh and J.A. Nelder, "Generalized Linear
Models", Chapman and Hall, London, 1989.

[0044] In general, tree-structured classifiers may be
very big and overloaded with data, even though they define
poor models of the structure of the problem. One of
the significant advantages of the CART methodology is
that "the explanatory tree" originally obtained may be
pruned, and the pruning procedure produces a sequence
of subtrees, the performance of each one of
these subtrees, in terms of misclassification likelihood,
or of expected misclassification costs, being evaluated
on the basis of sets of test images not present in the set
of training images, or else by means of the so-called
"cross validation approach" applied to the set of training
images.

[0045] The use of the best trees of the sequence of
pruned trees as classifiers, instead of the explanatory
trees, yields more parsimonious classifiers and reduces
the marked dependence of the predictions upon the set
of training images.

[0046] The present classification method has been
subjected by the applicant to a test on a so-called
high-level classification problem, in which it was necessary
to distinguish photographs from graphics and texts. In
this experiment, validation of the labelling of the terminal
nodes was not performed, and hence the rejected-images
class described above was not taken into consideration.

[0047] In particular, the test was carried out using both
the set of training images employed for the construction
of the classifiers and a set of test images which was
altogether unrelated to and independent of the set of
training images.
[0048] In detail, a database of images made up of
4500 images coming from various sources was used.
These images consisted of images downloaded from
the Web, scanned-in images, and bit-map versions of
electronic pages. In particular, the database of images
included 2600 photographs, 1300 graphics, and 700
texts.

[0049] The various images differed in size (ranging
from 120 x 120 to 1500 x 1500 pixels), resolution and
depth of tone. The classes of photographs included
photographs of indoor and outdoor scenes, landscapes,
people and things. The class of graphics included banners,
logotypes, maps, sketches, and photo-realistic
graphics. The class of texts included, instead, digitised
manuscript texts, black-and-white and colour texts, and
scanned or computer-generated texts with various
fonts. The classes of texts and graphics comprised images,
such as texts with a highly coloured background
or only a few words in large characters, and photo-realistic
graphics, the classification of which may be particularly
difficult.

[0050] Initially, a number of explanatory trees were 
constructed using various training sets made up of 
about 1600 images (approximately 700 photographs, 
600 graphics, and 300 texts) drawn at random from the 
above-mentioned database. In all the experiments, the
images not included in the set of training images were 
used to form a set of test images. 
[0051] In the experiments conducted using the train- 
ing sets and the explanatory trees, the percentages of 
correct classification of the images were as follows: pho- 
tographs, 95%-97%; graphics, 91%-93%; texts, 94%- 
97%. Instead, in the experiments conducted using the 
set of test images and the explanatory trees, the per- 
centages of correct classification of the images were as 
follows: photographs, 90%-91%; graphics, 80%-85%; 
texts, 89%-91%. 

[0052] These experiments were then repeated using 
pruned trees obtained by eliminating those features 
which captured purely local characteristics, such as the 
histograms of the colours and the directions of the edg- 
es, so obtaining a set of 72 low-level features. 
[0053] In the experiments carried out using the 
pruned trees, there was a mean increase in probability 
of correct classification of 4% for the photographs and 
3% for the graphics. In particular, using the training set, 
the percentages of correct classification of the images 
increased and were the following: photographs, 97%-98%;
graphics, 93%-95%; texts, 93%-96%. Instead, in
the set of test images, the percentages of correct
classification of the images were the following: photographs,
94%-95%; graphics, 84%-87%; texts, 88%-91%.
[0054] From an examination of the characteristics of 
the method of classification provided according to the 
present invention, the advantages that this makes pos- 
sible are evident. 

[0055] In particular, it is emphasized that the surprising
results illustrated above may be achieved with a
much smaller exploitation of computational resources
than that necessary for the implementation of the methods
according to the known art, in that the only real
computational effort is represented by the construction of the
tree-structured classifier, which occurs only once and
outside of the flow of execution in the phase of use of
the method.
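
By way of illustration only, the phase of use described above can be sketched as a walk down an already constructed binary tree, in which each terminal node corresponds to a classification region defined by a finite set of threshold conditions. The node layout, the two feature-vector components and the threshold values below are hypothetical and are not taken from the experiments; the sketch is meant only to show that classifying one feature vector costs at most one comparison per tree level.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    """A node of a binary tree-structured classifier (hypothetical layout)."""
    label: Optional[str] = None      # image class; set on terminal nodes only
    feature: int = 0                 # index of the tested feature-vector component
    threshold: float = 0.0           # condition tested: x[feature] <= threshold
    left: Optional["Node"] = None    # followed when the condition holds
    right: Optional["Node"] = None   # followed when it does not


def classify(root: Node, x: Sequence[float]) -> Tuple[str, int]:
    """Return (image class, number of comparisons) for feature vector x."""
    node, comparisons = root, 0
    while node.label is None:
        comparisons += 1
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label, comparisons


# Hypothetical tree. Component 0: normalized number of colours;
# component 1: fraction of high-contrast edge pixels. Values are made up.
TREE = Node(
    feature=0, threshold=0.30,
    left=Node(
        feature=1, threshold=0.20,
        left=Node(label="graphics"),
        right=Node(label="text"),
    ),
    right=Node(label="photograph"),
)

print(classify(TREE, [0.80, 0.05]))  # ('photograph', 1)
print(classify(TREE, [0.10, 0.40]))  # ('text', 2)
```

Every feature vector reaches exactly one terminal node, so the regions cover the whole feature space, and the cost of a classification is bounded by the depth of the tree rather than by the size of the training set.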

[0056] In addition, the present classification method
is highly optimizable and modular, lends itself to an
implementation through parallel architectural structures,
and is extremely "robust" in so far as the use of a
tree-structured classifier eliminates entirely the possibility of
not taking into consideration particular cases that might
arise in images.

[0057] Finally, it is clear that numerous variations and
modifications may be made to the classification method
described and illustrated herein, without thereby departing
from the protection scope of the present invention,
as defined by the claims.

Claims 

1. A content-based digital-image classification method,
characterized by comprising the steps of:

defining a set of low-level features describing
the semantic content of an image, said features
being quantities obtainable from the image by
means of logico-mathematical expressions that
are known beforehand, and the choice of said
features depending upon the image classes
used for the classification;

splitting the feature space defined by the said
features into a finite number of classification
regions, to each one of said regions there being
associated a respective image class, and each
of said classification regions being the locus of
the points of said feature space defined by a
finite set of conditions laid on at least one
component of said feature vector;

indexing an image to be classified, to extract
therefrom a vector of features the components
of which consist of the values assumed, in said
image, by said low-level features;

identifying, among said classification regions,
a specific classification region containing said
feature vector; and

identifying the image class associated to said
specific classification region.

2. The classification method according to claim 1, 
characterized in that said features of said set are 
chosen among the group comprising: 

a) the colour histogram in the 64-colour quantized
HSV colour space;

b) the colour coherence vectors in the 64-colour
quantized HSV colour space;

c) the 11-colour quantized colour transition histogram
in the HSV colour space;

d) the moments of inertia of colour distribution
in the non-quantized HSV colour space;

e) the moments of inertia and the kurtosis of the
luminance of the image;

f) the percentage of non-coloured pixels in the
image;

g) the number of colours of the image in the
64-colour quantized HSV colour space;

h) the statistical information on the edges of the
image extracted by means of Canny's algorithm;
in particular:

h1) the percentage of low-, medium- and
high-contrast edge pixels in the image;
h2) the parametric thresholds on the gradient
strength corresponding to medium and
high-contrast edges;
h3) the number of connected regions identified
by closed high-contrast contours; and
h4) the percentage of medium-contrast
edge pixels connected to high-contrast
edges;

i) the histogram of the directions of the edges
extracted by means of Canny's edge detector;

j) the mean value and the variance of the absolute
values of the coefficients of the subimages
of the first three levels of the multi-resolution
Daubechies wavelet transform of the luminance
of the image;

k) the estimation of the texture characteristics
of the image based on the neighbourhood grey-tone
difference matrix (NGTDM), in particular
coarseness, contrast, busyness, complexity,
and strength;

l) the spatial-chromatic histogram of the colour
regions identified by means of the 11-colour
quantization process in the HSV colour space,
and in particular:

l1) the co-ordinates of the centroid of the
colours; and

l2) the dispersion of the colour regions with
respect to their centroids;

m) the spatial composition of the colour regions
identified by means of the 11-colour quantization
process, and in particular:

m1) fragmentation;
m2) distribution of the colour regions with
respect to the centre of the image; and
m3) distribution of the colour regions with
respect to the x-axis and with respect to the
y-axis.

3. The classification method according to claim 1 or 2,
characterized in that said step of splitting said
feature space comprises the step of:

constructing a tree-structured classifier, by
recursively partitioning said feature space
according to a pre-set partition criterion.

4. The classification method according to claim 3,
characterized in that said classifier is a binary
tree-structured classifier.

5. The classification method according to claim 3 or 4,
characterized in that said tree-structured classifier
is constructed using the CART methodology.

6. The classification method according to any of
claims 3-5, characterized in that said step of
constructing a tree-structured classifier comprises the
steps of:

defining a set of training images which comprises,
for each image class, a plurality of images
having different characteristics;
indexing each of said training images to extract,
from each one of them, a respective said
feature vector; and

constructing said tree-structured classifier,
starting from the image vectors extracted from
said training images and from said pre-set
partition criterion.

7. The classification method according to any of
claims 3-6, characterized in that said partition
criterion is such as to render the two descendant nodes
deriving from the splitting of a parent node more
internally homogeneous in terms of types of images
contained therein.

8. The classification method according to any of
claims 3-7, characterized in that said step of
constructing said tree-structured classifier comprises
the steps of:

labelling nodes of said tree-structured classifier
according to a labelling criterion such as to
minimise, for each of said nodes, an expected
image misclassification cost which is indicative of
a reliability of said tree-structured classifier
conditioned at said node; and
validating labelling of the terminal nodes of said
classifier.

9. The classification method according to claim 8,
characterized in that said step of labelling nodes of
said tree-structured classifier comprises the step of
carrying out, at least for each one of the terminal
nodes of said tree-structured classifier, the following
step:

assigning to the terminal node a respective
label indicative of the image class associated to
the terminal node itself and chosen from a set
of labels that have been predefined on the basis
of said labelling criterion;

and at least one of the following steps:

determining a cardinality of said terminal node;
and

determining an image misclassification cost
relative to said terminal node, said cost being
indicative of a reliability of the tree-structured
classifier conditioned at the terminal node;

and characterized in that said step of validating
the labelling of the terminal nodes of said
classifier comprises the step of carrying out, for each
one of said terminal nodes, at least one of the
following steps:

comparing the cardinality of said terminal node
with a threshold cardinality; and
comparing the image misclassification cost
relative to said terminal node with a threshold
image misclassification cost;

and moreover the following steps:

modifying the label assigned to said terminal
node, should at least one of the following
conditions have occurred:

the cardinality of said terminal node has a first
pre-set relationship with said threshold
cardinality;

the image misclassification cost relative to said
terminal node has a second pre-set relationship
with said threshold image misclassification
cost;

validating the label assigned to the terminal
node in the event of neither of the above
conditions having occurred.

10. The classification method according to claim 9,
characterized in that said first pre-set relationship
is defined by the condition that the cardinality of said
terminal node is smaller than said threshold
cardinality, and in that said second pre-set relationship
is defined by the condition that the image
misclassification cost relative to said terminal node is higher
than said threshold image misclassification cost.
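
Purely as a non-limiting illustration, the validation of a terminal-node label recited in claims 9 and 10 might be sketched as follows. The names Leaf, validate_leaf, min_cardinality, max_cost and the replacement label "undetermined" are hypothetical and do not appear in the claims. It is also assumed here that the cardinality of a node is the number of training images that reach it, and that both comparisons are carried out, whereas the claims require at least one of them.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """Terminal node of the tree-structured classifier (hypothetical layout)."""
    label: str        # image class assigned according to the labelling criterion
    cardinality: int  # assumed: number of training images that reached this node
    cost: float       # image misclassification cost relative to this node


def validate_leaf(leaf: Leaf, min_cardinality: int, max_cost: float,
                  fallback: str = "undetermined") -> str:
    """Return the validated label, or the fallback label if it must be modified."""
    # First pre-set relationship (claim 10): cardinality smaller than threshold.
    too_small = leaf.cardinality < min_cardinality
    # Second pre-set relationship (claim 10): cost higher than threshold.
    too_costly = leaf.cost > max_cost
    # The label is modified if at least one condition occurred;
    # otherwise the assigned label is validated unchanged.
    return fallback if (too_small or too_costly) else leaf.label


# Illustrative values only; none of these figures come from the experiments.
print(validate_leaf(Leaf("photograph", 120, 0.04), min_cardinality=10, max_cost=0.25))
print(validate_leaf(Leaf("graphics", 3, 0.04), min_cardinality=10, max_cost=0.25))
print(validate_leaf(Leaf("text", 50, 0.40), min_cardinality=10, max_cost=0.25))
```

With these made-up thresholds the first node keeps its label, while the second and third have their labels replaced, one for insufficient cardinality and one for excessive misclassification cost.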